{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Tce3stUlHN0L"
      },
      "source": [
        "##### Copyright 2025 Google LLC."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "tuOe1ymfHZPu"
      },
      "outputs": [],
      "source": [
        "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "084u8u0DpBlo"
      },
      "source": [
        "# Voice memos"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wnQ_LVlzIeXo"
      },
      "source": [
        "<a target=\"_blank\" href=\"https://colab.research.google.com/github/google-gemini/cookbook/blob/main/examples/Voice_memos.ipynb\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" height=30/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "q7QvXQMrIhuZ"
      },
      "source": [
        "This notebook provides a quick example of how to work with audio and text files in the same prompt. You'll use the Gemini API to help you generate ideas for your next blog post, based on voice memos you recorded on your phone, and previous articles you've written."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qLuL9m7KhvxR"
      },
      "outputs": [],
      "source": [
        "%pip install -U -q \"google-generativeai>=0.7.2\""
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ATIbQM0NHhkj"
      },
      "outputs": [],
      "source": [
        "import google.generativeai as genai"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "l8g4hTRotheH"
      },
      "source": [
        "### Setup your API key\n",
        "\n",
        "To run the following cell, your API key must be stored it in a Colab Secret named `GOOGLE_API_KEY`. If you don't already have an API key, or you're not sure how to create a Colab Secret, see [Authentication](https://github.com/google-gemini/cookbook/blob/main/quickstarts/Authentication.ipynb) for an example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "d6lYXRcjthKV"
      },
      "outputs": [],
      "source": [
        "from google.colab import userdata\n",
        "GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n",
        "genai.configure(api_key=GOOGLE_API_KEY)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Mim5jZB6E1ag"
      },
      "source": [
        "Install PDF processing tools."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "4YjBq1mCE2cX"
      },
      "outputs": [],
      "source": [
        "!apt install poppler-utils"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MNvhBdoDFnTC"
      },
      "source": [
        "## Upload your audio and text files\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "V4XeFdX1rxaE"
      },
      "outputs": [],
      "source": [
        "!wget https://storage.googleapis.com/generativeai-downloads/data/Walking_thoughts_3.m4a\n",
        "!wget https://storage.googleapis.com/generativeai-downloads/data/A_Possible_Future_for_Online_Content.pdf\n",
        "!wget https://storage.googleapis.com/generativeai-downloads/data/Unanswered_Questions_and_Endless_Possibilities.pdf"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "_HzrDdp2Q1Cu"
      },
      "outputs": [],
      "source": [
        "audio_file_name = \"Walking_thoughts_3.m4a\"\n",
        "audio_file = genai.upload_file(path=audio_file_name)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MpAU9V3YEvXh"
      },
      "source": [
        "## Extract text from the PDFs"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "NttNb-M_ExlK"
      },
      "outputs": [],
      "source": [
        "!pdftotext A_Possible_Future_for_Online_Content.pdf\n",
        "!pdftotext Unanswered_Questions_and_Endless_Possibilities.pdf"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "LYA-OlD6Gw3v"
      },
      "outputs": [],
      "source": [
        "blog_file_name = \"A_Possible_Future_for_Online_Content.txt\"\n",
        "blog_file = genai.upload_file(path=blog_file_name)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "9Sf63anDG317"
      },
      "outputs": [],
      "source": [
        "blog_file_name2 = \"Unanswered_Questions_and_Endless_Possibilities.txt\"\n",
        "blog_file2 = genai.upload_file(path=blog_file_name2)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "EPPOECHzsIGJ"
      },
      "source": [
        "## System instructions\n",
        "\n",
        "Write a detailed system instruction to configure the model."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "y5zh5WzwGiR5"
      },
      "outputs": [],
      "source": [
        "si=\"\"\"Objective: Transform raw thoughts and ideas into polished, engaging blog posts that capture a writers unique style and voice.\n",
        "Input:\n",
        "Example Blog Posts (1-5): A user will provide examples of blog posts that resonate with their desired style and tone. These will guide you in understanding the preferences for word choice, sentence structure, and overall voice.\n",
        "Audio Clips: A user will share a selection of brainstorming thoughts and key points through audio recordings. They will talk freely and openly, as if they were explaining their ideas to a friend.\n",
        "Output:\n",
        "Blog Post Draft: A well-structured first draft of the blog post, suitable for platforms like Substack or LinkedIn.\n",
        "The draft will include:\n",
        "Clear and engaging writing: you will strive to make the writing clear, concise, and interesting for the target audience.\n",
        "Tone and style alignment: The language and style will closely match the examples provided, ensuring consistency with the desired voice.\n",
        "Logical flow and structure: The draft will be organized with clear sections based on the content of the post.\n",
        "Target word count: Aim for 500-800 words, but this can be adjusted based on user preferences.\n",
        "Process:\n",
        "Style Analysis: Carefully analyze the example blog posts provided by the user to identify key elements of their preferred style, including:\n",
        "Vocabulary and word choice: Formal vs. informal, technical terms, slang, etc.\n",
        "Sentence structure and length: Short and impactful vs. longer and descriptive sentences.\n",
        "Tone and voice: Humorous, serious, informative, persuasive, etc.\n",
        "Audio Transcription and Comprehension: Your audio clips will be transcribed with high accuracy. you will analyze them to extract key ideas, arguments, and supporting points.\n",
        "Draft Generation: Using the insights from the audio and the style guidelines from the examples, you will generate a first draft of the blog post. This draft will include all relevant sections with supporting arguments or evidence, and a great ending that ties everything together and makes the reader want to invest in future readings.\n",
        "\"\"\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CuEXqOnYzhM-"
      },
      "source": [
        "## Generate Content"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "63ec84f8e8a8"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "## The Throwaway Work That Makes You Better\n",
            "\n",
            "Early in my career, I spent a lot of time working on visions, roadmaps, and ideas. Some of them ended up not happening at all, or were essentially thrown away.  It was frustrating, especially coming straight out of school with the idea that you’re given an assignment, you do it, and then you’re graded on it.  There's no \"takesies-backsies\" in school.  You get the assignment, you produce the work, and that’s that. \n",
            "\n",
            "The real world is a lot different. \n",
            "\n",
            "It's a constant adjustment, where priorities change, markets shift, and sometimes the work you do doesn't get used or even goes nowhere.  It felt like a colossal waste of time!\n",
            "\n",
            "It took me a while to get over it, but I don’t think I truly appreciated this until I joined my current team. They have a \"right to think\" culture, and that changed everything. \n",
            "\n",
            "Suddenly, I realized that the work you produce, the content you create, is part of the process of making you better at what you do in the future.  It’s about honing your skills and getting better over time. \n",
            "\n",
            "The \"right to think\" culture isn’t simply about accepting that priorities change and things shift; it's about recognizing that what you produce, even if it's not the final product, is valuable.  It's about learning and iterating.\n",
            "\n",
            "This \"right to think\" idea ties in nicely with iterative processes and learning by doing.  It's a reframing of the mindset around throwaway work.  There's no such thing as throwaway work.  It's all helping you to hone your skills and get better over time. \n",
            "\n",
            "I'm constantly talking about this reframing, and I think it’s important to call out the importance of writing to think. It's about being willing to scrap things and move on once they’ve served their purpose.  Write more, write earlier, write often. Don’t worry about the final product. Just get it out there.\n",
            "\n",
            "In the end, the work you produce, even if it's ultimately discarded, is still part of the process that makes you a better thinker, a better creator, and a better professional. So, don’t be afraid to throw things away. It might just make you better in the long run. \n",
            "\n"
          ]
        }
      ],
      "source": [
        "prompt = \"Draft my next blog post based on my thoughts in this audio file and these two previous blog posts I wrote.\"\n",
        "\n",
        "model = genai.GenerativeModel(model_name=\"models/gemini-2.5-flash\", system_instruction=si)\n",
        "\n",
        "response = model.generate_content([prompt, blog_file, blog_file2, audio_file],\n",
        "                                  request_options={\"timeout\": 600})\n",
        "print(response.text)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "K5oUCqb6IUnH"
      },
      "source": [
        "## Learning more\n",
        "\n",
        "* Learn more about the [File API](https://github.com/google-gemini/cookbook/blob/main/quickstarts/File_API.ipynb) with the quickstart."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [
        "Tce3stUlHN0L"
      ],
      "name": "Voice_memos.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
